feat(tableau): ability to force extraction of table/column level linage from SQL queries #9838
@alexs-101 thanks for the PR! Before diving into the code, I wanted to make sure I understood the motivation behind each feature. Btw, it makes it a lot easier to review if things are split up into multiple PRs.
Things that make sense to me
Things I'd like to discuss
One other thing - it seems like the t-sql changes only make sense when
@hsheth2 big thanks for the quick feedback! I will answer (or fix) everything tomorrow morning with a fresh mind.
It's a very helpful feature for those who can ingest only Tableau and don't have access to other metadata sources. Could we keep it?
yep
@hsheth2 finished, could you start the tests again, please?
Seems like this test failed
I checked, it cannot be due to my changes
@hsheth2 I pulled in the latest changes from master, could you please start the tests again?
@hsheth2 I checked the failed tests, and none of the failures look like they were caused by the changes in this PR.
I will fix them if you say so. :)
@alexs-101 @egemenberk I agree those failures look unrelated, and I believe they have been fixed on the master branch. I just synced this branch with master, and hopefully that unblocks CI.
Found bug #10357: when the force_extraction_of_lineage_from_custom_sql_queries property is not used, an error is thrown.
Related issues
Everything was fixed in a backward-compatible way and should not break anything.
Closes #9841
Closes #9842
Closes #9843
Closes #9844
Closes #9845
Closes #9846
Closes #9847
Closes #9880
Closes #9881
Features
Details
force_extraction_of_lineage_from_custom_sql_queries: true | false
If enabled, the Tableau source plugin does not rely on Tableau's Custom SQL metadata. Instead, it always parses table- and column-level lineage out of the SQL queries, regardless of whether Tableau supports the query. We implemented this because, in our case, the quality of Tableau's Custom SQL metadata (even for supported SQL queries) was worse than what the sqlglot-based parser produces. It allowed us to finally ingest our huge Tableau projects and be happy enough with the quality of the resulting lineage.
disable_schema_awarenes_during_parsing_of_sql_queries: true | false
If enabled, the SQL parser does not look up pre-ingested table schemas from DataHub during parsing. This helps when the Tableau projects are old, some of them are abandoned, and the Custom SQL queries are out of sync with the real table schemas; in that case the SQL parser fails silently when it tries to resolve the columns referenced by the queries. The feature makes it possible to ingest such a Tableau swamp and still get at least table-level lineage and mostly column-level lineage (as long as columns are prefixed with aliases and * is not used).
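To illustrate why alias-prefixed columns can still be traced without schema information, while unqualified columns and * cannot, here is a toy sketch. It is not the sqlglot-based parser used by the plugin; the function name `resolve_columns_without_schema` and the sample table names are hypothetical and exist only for this example.

```python
from typing import Dict, List, Optional, Tuple


def resolve_columns_without_schema(
    alias_to_table: Dict[str, str], column_refs: List[str]
) -> List[Tuple[str, Optional[str]]]:
    """Map each column reference to 'table.column', or None if it cannot be
    resolved without knowing the table schemas (toy illustration only)."""
    resolved: List[Tuple[str, Optional[str]]] = []
    for ref in column_refs:
        if ref == "*" or ref.endswith(".*"):
            # Expanding a star requires the list of columns, i.e. a schema.
            resolved.append((ref, None))
        elif "." in ref:
            # An alias prefix names the source table directly.
            alias, column = ref.split(".", 1)
            table = alias_to_table.get(alias)
            resolved.append((ref, f"{table}.{column}" if table else None))
        elif len(alias_to_table) == 1:
            # With a single table in the query there is no ambiguity.
            only_table = next(iter(alias_to_table.values()))
            resolved.append((ref, f"{only_table}.{ref}"))
        else:
            # Unqualified column with several candidate tables: a schema
            # would be needed to decide which table owns it.
            resolved.append((ref, None))
    return resolved


# Hypothetical query: SELECT o.id, c.name, total, * FROM sales.orders o JOIN sales.customers c ...
aliases = {"o": "sales.orders", "c": "sales.customers"}
result = resolve_columns_without_schema(aliases, ["o.id", "c.name", "total", "*"])
for ref, source in result:
    print(ref, "->", source)
```

In this sketch the table-level lineage (the values of `aliases`) is always available, which matches the "at least table-level lineage" behavior described above.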
How to use
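A minimal sketch of how the two flags might be switched on, written as a Python dict that mirrors the source section of an ingestion recipe. The flag names are spelled exactly as in this description; the `connect_uri` value is a placeholder and not taken from the PR, and a real recipe would also need credentials and a sink.

```python
# Hypothetical recipe fragment; in a YAML recipe the same keys would sit
# under source.config.
recipe = {
    "source": {
        "type": "tableau",
        "config": {
            # Placeholder URL, assumed for illustration only.
            "connect_uri": "https://tableau.example.com",
            # Always parse lineage from the SQL text instead of relying on
            # Tableau's Custom SQL metadata.
            "force_extraction_of_lineage_from_custom_sql_queries": True,
            # Do not look up pre-ingested table schemas while parsing.
            "disable_schema_awarenes_during_parsing_of_sql_queries": True,
        },
    },
}

# List the boolean flags that are switched on in this fragment.
enabled_flags = sorted(
    key for key, value in recipe["source"]["config"].items() if value is True
)
print(enabled_flags)
```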
Also, the PR contains some bug fixes for issues that we found while ingesting metadata from Tableau.
Checklist